perm filename DBL1.TEX[PEG,DBL] blob
sn#506013 filedate 1980-04-22 generic text, type C, neo UTF8
COMMENT ā VALID 00013 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00003 00002 \init{
C00004 00003 \partbegin{Part One}
C00008 00004 \chapbegin{1} % Here beginneth Chapter 1
C00012 00005 \subsectionbegin[1.1]{Detour: Analysis of a Discovery}
C00016 00006 \subsectionbegin[1.2]{What AM Does: Syntheses of Discoveries}\par
C00023 00007 \subsectionbegin[1.3]{Results}
C00030 00008 \subsectionbegin[1.4]{Conclusions}
C00032 00009 \sectionbegin[2]{VIEWING AM AS SOME COMMON PROCESS}
C00034 00010 \subsectionbegin[2.1]{AM as Hill-Climbing}
C00038 00011 \subsectionbegin[2.2]{AM as Heuristic Search}
C00052 00012 \subsectionbegin[2.3]{AM as a Mathematician}
C00063 00013 \subsectionbegin[2.4]{AM as Half a Book}\par
C00068 ENDMK
Cā;
\init{
\input cmpdes
\input ammacs
\def\draft{T}
\titlepage
\newpage
}
\partbegin{Part One}
\parttitle{AM: DISCOVERY IN MATHEMATICS AS HEURISTIC SEARCH}
\rjustline{{\:A AM: Discovery}}
\rjustline{{\:A in Mathematics}}
\rjustline{{\:A as Heuristic Search}}
\runningrighthead{INTRODUCTION}
\vskip 8pc
\epigraph{Indeed, you can build a machine to draw demonstrative
conclusions for you, but I think you can never build a machine
that will draw plausible inferences.}
\author{P\'olya}
\source{Mathematical Discovery}
\epigraphend
\noindent\!
This part describes a program, called ``AM,'' which models one aspect of
elementary mathematics research: developing new concepts under the
guidance of a large body of heuristic rules. ``Mathematics'' is
considered as a type of intelligent behavior, not merely as a finished
product.
The local heuristics communicate via an agenda mechanism, a global
list of tasks for the system to perform and reasons why each task is
plausible. A single task might direct AM to define a new concept, or
to explore some facet of an existing concept, or to examine some
empirical data for regularities, etc. Repeatedly, the program
selects from the agenda the task having the best supporting reasons,
and then executes it.
Each concept is an active, structured knowledge module. A hundred
very incomplete modules are initially provided, each one
corresponding to an elementary set-theoretic concept (\eg, union).
This provides a definite but immense ``space'' which AM begins to
explore, guided by a corpus of 250 heuristic rules. AM extends its
knowledge base, ultimately rediscovering hundreds of common concepts
(\eg, numbers) and theorems (\eg, unique factorization).
\vfill
\eject\topofpage
\chapbegin{1} % Here beginneth Chapter 1
\chaptertitle{OVERVIEW}
\rjustline{{\:B Overview}}
\runningrighthead{INTRODUCTION}
\vskip 14pc
\epigraph{We need a super-mathematics in which the operations are as unknown as
the quantities they operate on, and a super-mathematician, who does
not know what he is doing when he performs these operations.}
\author{Eddington}
\epigraphend
\sectionbegin[1]{FIVE-PAGE SUMMARY OF THE PROJECT}
Scientists often face the difficult task of formulating nontrivial
research problems which are solvable. In any given branch of
science, it is usually easier to tackle a specific given problem than
to propose interesting yet manageable new questions to investigate.
For example, contrast {\sl solving} the Missionaries and Cannibals
problem with the more ill-defined reasoning which led to
{\sl inventing} it.
This half of the book is concerned with creative theory formation in
mathematics: how to propose interesting new concepts and plausible
hypotheses connecting them. The experimental vehicle of my research
is a computer program called \ffootnote{{\AM}.}{The original meaning of this
mnemonic has been abandoned. As Exodus states: {\sl I {$\underline{AM}$} that I
${\underline{AM}}$. }} Initially, {\AM} is
given the definitions of 115 simple set-theoretic concepts (like
``Delete'' or ``Equality''). Each concept is represented internally as a
data structure with a couple dozen slots or facets (like
``Definition,'' ``Examples,'' ``Worth''). Initially, most facets of most
concepts are blank, and {\AM} uses a collection of 250 heuristics---\!
plausible rules of thumb---for guidance, as it tries to fill in
those blanks. Some heuristics are used to select which specific facet
of which specific concept to explore next, while others are used to
actually find some appropriate information about the chosen facet.
Other rules prompt {\AM} to notice simple relationships between known
concepts, to define promising new concepts to investigate, and to
estimate how interesting each concept is.
\subsectionbegin[1.1]{Detour: Analysis of a Discovery}
Before discussing how to {\sl synthesize} a new theory, consider
briefly how to {\sl analyze} one, how to construct a plausible chain of
reasoning which terminates in a given discovery. One can do this by
working backwards, by reducing the creative act to simpler and
simpler creative acts. For example, consider the concept of prime
numbers. How might one be led to define such a notion? Notice the
following plausible strategy:
\extractbegin
``If $f$ is a function which transforms elements of
$A$ into elements of $B$, and $B$ is ordered, then consider just those members
of $A$ which are transformed into {\sl extremal} elements of $B$.
This set is an interesting subset of $A$.''
\extractend
When $f(x)$ means ``divisors of $x$,'' and the ordering is ``by length,''
this heuristic says to consider those numbers which have
a \ffootnote{minimal}{The
other extreme, numbers with a {\sl maximal} number of factors, was also
proposed by {\AM} as worth investigating. This led {\AM} to many
interesting questions.}\ number of factors---that is, the primes.
So this rule actually {\sl reduces} our task
from ``proposing the concept of prime numbers'' to the more elementary
problems of ``discovering ordering-by-length'' and ``inventing
divisors-of.''
But suppose we know this general rule: {\bf``If $f$ is an interesting
function, consider its inverse.''} It reduces the task of discovering
divisors-of to the simpler task of discovering \footnote{multiplication.}{Plus
noticing that multiplication is associative and commutative. } Eventually,
this task reduces to the discovery of very basic notions,
like substitution, set-union, and equality. To explain how a given
researcher might have made a given discovery, such an analysis is
continued until that inductive task is reduced to ``discovering''
notions which the researcher already knew, which were his conceptual
primitives.
\subsectionbegin[1.2]{What AM Does: Syntheses of Discoveries}\par
\epigraph{This leads to the paradox that the more original a discovery the
more obvious it seems afterwards.
The creative act is not an act of creation in the sense of the Old Testament.
It does not create something out of nothing; it uncovers, selects, re-shuffles,
combines, synthesizes already existing facts, faculties, skills.
The more familiar the parts, the more striking the new whole.}
\author{Koestler}
\epigraphend
\noindent Suppose a large collection of these heuristic strategies has been
assembled (\eg, by analyzing a great many discoveries, and writing
down new heuristic rules whenever necessary). Instead of using them
to {\sl explain} how a given idea might have evolved, one can imagine
starting from a basic core of knowledge and ``running'' the heuristics
to {\sl generate} new concepts. We're talking about reversing the
process described in the last section: not how to {\sl explain}
discoveries, but how to {\sl make} them.
Such syntheses are precisely what {\AM} does. The program consists of a
large corpus of primitive mathematical concepts, each with a few
associated \ffootnote{heuristics.}{Situation/action rules which function as
local ``plausible move generators.'' Some suggest tasks for the system
to carry out, some suggest ways of satisfying a given task, etc.}
{\AM}'s activities all serve to expand {\AM} itself, to enlarge upon a
given body of mathematical knowledge. To cope with the enormity of
the potential ``search space'' involved, {\AM} uses its heuristics as
judgmental criteria to guide development in the most promising
direction. It appears that the process of inventing
worthwhile \footnote{new}{Typically, ``new'' means new to {\AM}, not to human beings;
and ``worthwhile'' can be judged only with hindsight.} concepts can be guided
successfully using a collection of a few hundred such heuristics.
Each concept is represented as a frame-like data structure with 25
different facets or slots. The types of facets include: {\bf Ex\-am\-ples,
De\-fi\-ni\-tions, Ge\-ne\-rali\-za\-tions, Do\-main/Range, Ana\-lo\-gies,
In\-terest\-ing\-ness}, and many others. Modular representation of
concepts provides a convenient scheme for organizing the heuristics;
for example, the following strategy fits into the {\bf Ex\-am\-ples} facet
of the {\bf Pre\-di\-cate} concept: {\it ``If, empirically, 10 times as many
elements {\bf fail} some predicate P, as {\bf satisfy} it, then some
{\bf generali\-zation} (weakened version) of P might be more interesting
than P.''} {\AM} considers this suggestion after trying to fill in
examples of each \footnote{predicate.}{In fact, after {\AM} attempts to find
examples of SET-EQUALITY, so few are found that {\AM} decides to
generalize that predicate. The result is the creation of
a new predicate which means
``Has-the-\-same-length-as''---\ie, a rudimentary precursor to natural
numbers.}
{\AM} is initially given a collection of 115 core concepts, with only a
few facets filled in for each concept. Its sole activity is to choose some
facet of some concept, and fill in that particular slot. In so
doing, new notions will often emerge. Uninteresting ones are
forgotten, mildly interesting ones are kept as parts of one facet of
one concept, and very interesting ones are granted full
concept-module status. Each of these new modules has dozens of blank
slots; hence the space of possible actions (blank facets to fill in)
grows rapidly. The same heuristics are used both to suggest new
directions for investigation and to limit attention: both to sprout
and to prune.
\subsectionbegin[1.3]{Results}
The particular mathematical domains in which {\AM} operates depend upon
the choice of initial concepts. Currently, {\AM} begins with nothing
but a scanty knowledge of concepts which Piaget might describe as
{\sl prenumerical}: Sets, substitution, operations, equality, and so
on. In particular, {\AM} is not told anything about proof,
single-valued functions, or numbers.
From this primitive basis, {\AM} quickly \ffootnote{discovered}{``Discovering'' a
concept means that (1) {\AM} recognized it as a distinguished entity
(\eg, by formulating its definition) and also (2) {\AM} decided it was
worth investigating (either because of the interesting way it was
formed, or because of surprising preliminary empirical results).}
elementary numerical concepts (corresponding to those we refer to as
natural numbers, multiplication, factors, and primes) and wandered
around in the domain of elementary number theory. {\AM} was not designed
to {\sl prove} anything, but it did {\sl conjecture} many well-known
relationships (\eg, the unique factorization theorem).
{\AM} was not able to discover any ``new-to-Mankind'' mathematics purely
on its own, but {\sl has} discovered several interesting notions
hitherto unknown to the author. A couple bits of new mathematics have
been {\sl inspired} by {\AM}. A synergetic {\AM}--human combination can
sometimes produce better research than either could \footnote{alone.}{This is
supported by Gelernter's experiences with his geometry program: while
lecturing about how it might prove a certain theorem about isosceles
triangles, he came up with a new, cute proof. Similarly, Guard and
Eastman noticed an intermediate result of their SAM resolution
theorem prover, and wisely interpreted it as a nontrivial result in
lattice theory (now known as SAM's lemma).} Although most of the
concepts {\AM} proposed and developed were already very well known, {\AM}
defined some of them in novel ways (\eg, prime pairs were defined by
restricting addition to primes; that is, for which primes $p,q,r$ is it
possible that $p+q=r$\footnote{?}{The answer is that either $p$ or $q$ must be 2,
and that the other two primes are a prime pair---\ie, they differ
by two. }).
Everything that {\AM} does can be viewed as testing the underlying body
of heuristic rules. Gradually, this knowledge becomes better
organized, its implications clearer. The resultant body of detailed
heuristics may be the germ of a more efficient programme for
educating math students than current \footnote{dogma.}{Currently, an
educator takes the very best work any mathematician has ever done,
polishes it until its brilliance is blinding, then presents it to the
student to induce upon. Many individuals (\eg, Knuth and Polya) have
pointed out this blunder. A few (\eg, Papert at MIT, Adams at
Stanford, Seely-Brown and Goldstein at Xerox-Parc)
are experimenting with more realistic strategies for
``teaching'' creativity.}
Another benefit of actually constructing {\AM} is that of
{\sl experimentation\/}: one can vary the concepts {\AM} starts with, vary
the heuristics available, etc., and study the effects on {\AM}'s
behavior. Several such experiments were performed. One involved
adding a couple dozen new concepts from an entirely new domain: plane
geometry. {\AM} busied itself exploring elementary geometric concepts,
and was almost as productive there as in its original domain. New
concepts were defined, and new conjectures formulated. Other
experiments indicated that {\AM} was more robust than anticipated; it
withstood many kinds of ``de-tuning.'' Others demonstrated the
tremendous impact that a few key concepts (\eg, Equality) had on
{\AM}'s behavior. Several more experiments and extensions have been
planned for the future.
\subsectionbegin[1.4]{Conclusions}
{\AM} is forced to judge {\it a priori} the value of each new concept, to
lose interest quickly in concepts which aren't going to develop into
anything. Often, such judgments can only be based on hindsight. For
similar reasons, {\AM} has difficulty formulating new heuristics which
are relevant to the new concepts it creates. Heuristics are often
merely compiled hindsight. While {\AM}'s ``approach'' to empirical
research may be used in other scientific domains, the main limitation
(reliance on hindsight) will probably recur. This prevents {\AM} from
progressing indefinitely far on its own.
This ultimate limitation was reached. {\AM}'s performance degraded more
and more as it progressed further away from its initial base of
concepts. Nevertheless, {\AM} demonstrated that selected aspects of
creative discovery in elementary mathematics could be adequately
represented as a heuristic search process. Actually constructing a
computer model of this activity has provided an experimental vehicle
for studying the dynamics of plausible empirical inference.
\sectionskip
\sectionbegin[2]{VIEWING AM AS SOME COMMON PROCESS}
This section will provide a few metaphors: some hints for squeezing
{\AM} into paradigms with which the reader might be familiar. For
example, the existence of heuristics in {\AM} is functionally the same
as the presence of domain-specific information in any knowledge-based
system.
Consider assumptions, axioms, definitions, and theorems to be
syntactic rules for the language that we call Mathematics. Thus
theorem-proving, and the whole of textbook mathematics, is a purely
syntactic process. Then the heuristic rules used by a mathematician
(and by {\AM}) would correspond to the semantic knowledge associated
with these more formal methods.
Just as one can upgrade natural-language-understanding by
incorporating semantic knowledge, so {\AM} is only as successful as the
heuristics it knows.
Four more ways of ``viewing'' {\AM} as something else will be provided:
({\it i\/}) {\AM} as a hill-climber, ({\it ii\/}) {\AM} as a
heuristic search program,
({\it iii\/}) {\AM} as a mathematician, and ({\it iv\/}) {\AM} as half a book.
\subsectionbegin[2.1]{AM as Hill-Climbing}
Let's draw an analogy between the process of
developing new mathematics and the familiar
process of hill-climbing. We may visualize {\AM} as exploring a space
using a measuring or ``evaluation'' function which imparts to it a topography.
Consider {\AM}'s core of very simple knowledge. By compounding its
known concepts and methods, {\AM} can explore beyond the frontier of
this foundation a little
wherever it wishes. The incredible variety of alternatives to
investigate includes all known mathematics, much trivia, countless
dead ends, and so on. The only ``successful'' paths near the core are
the narrow ridges of known mathematics (plus perhaps a few
as-yet-undiscovered isolated peaks).
How can {\AM} walk through this immense space with any hope of
following the few, slender trails of already-established
mathematics (or some equally successful new fields)? {\AM} must do
hill-climbing: as new concepts are formed, decide how promising they
are, and always explore the currently most-promising new concept. The
evaluation function is quite nontrivial, and the work reported
in this half of the book may be
viewed as an attempt to study and explain and duplicate the
judgmental criteria people employ. Preliminary \ffootnote{attempts}{These took
the form of informal simulations. Although far from controlled
experiments, they indicated the feasibility of attempting to create {\AM},
by yielding an approximate figure for the amount of informal knowledge
such a system would need.} at codifying such
``mysterious'' emotive forces as intuition, aesthetics, utility,
richness, interestingness, relevance\ellipsis indicated that a large but
not unmanageable collection of heuristic rules should suffice.
The important visualization to make is that with proper evaluation
criteria, {\AM}'s planar mass of interrelated concepts is transformed
into a three-dimensional relief map: the known lines of development
become mountain ranges, soaring above the vast flat plains of trivia
and inconsistency below.
Occasionally an isolated hill is discovered near the \footnote{core;}{For example,
Conway's numbers, as described in [Knuth 74].}
certainly whole ranges lie undiscovered for long periods
of \footnote{time,}{For example, non-Euclidean geometries weren't
thought of until 1848.} and
the terrain far from the initial core is not yet explored at all.
\subsectionbegin[2.2]{AM as Heuristic Search}
We earlier referred to {\AM} as a
kind of ``heuristic search'' program. That must mean that {\AM} is
exploring a particular ``space,'' using some informal evaluation
criteria to guide it.
The flavor of search which is used here is that of progressively
enlarging a tree. Certain ``evaluation-function'' heuristics are used
to decide which node of the tree to expand next, and other guiding
rules are then used to produce from that node a few interesting
successor nodes. To do mathematical research well, I claim that it is
necessary and sufficient to have good methods for proposing new
concepts from existing ones, and for deciding how interesting each
``node'' (partially-studied concept) is.
{\AM} is initially supplied with a few facts about some simple math
concepts. {\AM} then explores mathematics by selectively enlarging that
basis. One could say that {\AM} consists of an active body of
mathematical concepts, plus enough ``wisdom'' to use and develop them
effectively (for ``wisdom,'' read ``heuristics''). Loosely speaking, then,
{\AM} is a heuristic search program. To see this more clearly, we must
explain what the nodes of {\AM}'s search space are, what the successor
operators or links are, and what the evaluation function is.
{\AM}'s space can be considered to consist of all nodes which are
consistent, partially-filled-in concepts. Then a primitive ``legal
move'' for {\AM} would be to ({\it i\/}) enlarge some facet of some concept, or
({\it ii\/}) create a new, partially-complete concept. Consider momentarily
the size of this space. If there were no constraint on what the new
concepts can be, and no informal knowledge for quickly finding
entries for a desired facet, a blind ``legal-move'' program would go
nowhere---slowly! One shouldn't even call the activity such a
program would be doing ``math research.''
The heuristic rules are used as little ``plausible move generators.''
They suggest which facet of which concept to enlarge next, and they
suggest specific new concepts to create. The only activities which {\AM}
will consider doing are those which have been motivated for some
specific \ffootnote{good}{Of course, {\AM} thinks a reason is ``good'' if---and
only if---it was told that by a heuristic rule; so those rules had
better be plausible, preferably the ones actually used by the
experts.} reason. A global {\sl agenda of tasks} is maintained,
listing all the activities suggested but not yet worked on.
{\AM} has a definite algorithm for rating the nodes of its space. Many
heuristics exist merely to estimate the worth of any given concept.
Other heuristics use these worth ratings to order the tasks on the
global agenda list. Yet {\AM} has no specific goal criteria: it can
never ``halt,'' never succeed or fail in any absolute sense. {\AM} goes
on \footnote{forever.}{Technically, forever is about 100,000 list cells and a
couple of cpu hours.}
Consider Nilsson's descriptions of depth-first searching and breadth-first
searching ([Nilsson 71]). He has us maintain a list of ``open'' nodes.
Repeatedly, he plucks the top one and expands it. In the process,
some new nodes may be added to the Open list. In the case of
depth-first searching, they are added at the top; the next node to
expand is the one most recently created; the Open-list is being used
as a push-down stack. For breadth-first search, new nodes are added
at the bottom; they aren't expanded until all the older nodes have
been; the Open-list is used as a queue. For heuristic search, or
``best-first'' search, new nodes are evaluated in some numeric way, and
then ``merged'' into the already-sorted list of Open nodes.
This process is very similar to the agenda mechanism {\AM} uses to
manage its search. This will be discussed in detail in Chapter
3. Each entry on the agenda consists of three parts:
({\it i\/}) a plausible task for {\AM}
to do, ({\it ii\/}) a list of reasons supporting that task, and
({\it iii\/}) a
numeric estimate of the overall priority this task should have.
When a task is suggested for some reason, it is added to the agenda.
A task may be suggested several times, for different reasons. The
global priority value assigned to each task is based on the combined
value of its reasons. The control structure of {\AM} is simply to
select the task with the highest priority, execute it, and select a
new one. The agenda mechanism appears to be a very well-suited data
structure for managing a ``best-first'' search process.
Similar control structures were used in LT [Newell, Shaw, & Simon 57],
the predictor part of Dendral [Buchanan {\etal} 69], SIMULA-67 [Dahl 68],
and KRL [Bobrow & Winograd 77].
The main difference is that in {\AM}, symbolic reasons are
used (albeit in trivial token-like ways) to decide whether---and how
much---to boost the priority of a task when it is suggested again.
There are several difficulties and anomalies in forcing {\AM} into the
heuristic search paradigm. In a typical heuristic search
(\eg, Dendral [Feigenbaum {\etal} 71], Meta-Dendral [Buchanan {\etal} 72],
most game-playing programs [Samuel 67]), a ``search space'' is defined
implicitly by a ``legal move generator.'' Heuristics are present to constrain
that generator so that only plausible nodes are produced.
The second kind of heuristic search, of which {\AM} is an example,
contains no ``legal move generator.'' Instead, {\AM}'s heuristics are used
as plausible move generators.
Those heuristics themselves implicitly define the possible tasks {\AM} might
consider, and {\sl all} such tasks should be plausible ones. In the first kind of
search, removing a heuristic widens the search space; in {\AM}'s kind of
search, removing a heuristic {\sl reduces} it.
Another anomaly is that the operators which {\AM} uses to enlarge and
explore the space of concepts are themselves mathematical concepts
(\eg, some heuristic rules result in the creation of new heuristic
rules; ``Compose'' is both a concept and an operation which results in
new concepts). Thus {\AM} should be viewed as a mass of knowledge which
enlarges {\sl itself} repeatedly. Typically,
computer programs keep the information they ``discover'' quite
separate from the knowledge they use to make \ffootnote{discoveries.}{Of course,
this is often because the two kinds of knowledge are very
different: For a chess-player, the first kind is ``good board
positions,'' and the second is ``strategies for making a good move.''
Theorem-provers are an exception. They produce
a new theorem, and then use it (almost like a new operator) in future proofs.
A program to learn
to play checkers [Samuel 67] has this same flavor, thereby indicating that
this ``self-help'' property can be used in many domains, not simply
in mathematics.}
Perhaps the greatest difference between {\AM} and typical heuristic
search procedures is that {\AM} has no well-defined target concepts or
target relationships. Rather, its ``goal criterion''---its sole aim---is
to maximize the interestingness level of the activities it
performs, the priority ratings of the top tasks on the agenda. It
doesn't matter precisely which definitions or conjectures {\AM}
discovers---or misses---so long as it spends its time on plausible
tasks. There is no fixed set of theorems that {\AM} should discover, so
{\AM} is not a typical {\sl problem-solver}. There is no fixed set of
traps {\AM} should avoid, no small set of legal moves, and no winning/losing
behavior, so {\AM} is not a typical {\sl game-player}.
For example, no stigma is attached to the fact that {\AM} never
discovered real \footnote{numbers;}{There are many ``nice'' things which {\AM}
didn't---and can't---do: \eg, devising {$\underline{geometric}$} concepts from
its initial simple set-theoretic knowledge. See the discussion of
the limitations of {\AM}, Section 7--2.2. } it
was rather surprising that {\AM} managed to discover {\sl natural}
numbers! Even if it hadn't done that, it would have
been \footnote{acceptable}{Acceptable to whom? Is there really a domain-invariant
criterion for judging the quality of {\AM}'s actions? See the
discussions in Section 7--1. } if {\AM} had simply gone off and
developed ideas in set theory.
\subsectionbegin[2.3]{AM as a Mathematician}
Before diving into the innards of {\AM}, let's take a moment to discuss
the totality of the mathematics which {\AM} carried out. Like a
contemporary historian summarizing the work of the Babylonian
mathematicians, we shan't hesitate to use current terms and criticize
by current standards.
{\AM} began its investigations with scanty knowledge of a few
set-theoretic concepts (sets, equality of sets, set operations).
Most of the obvious set-theory relations (\eg, de Morgan's laws)
were eventually uncovered; since {\AM} never fully understood abstract
algebra, the statement and verification of each of these was quite
obscure. {\AM} never derived a formal notion of infinity, but it
naively established conjectures like ``a set can never be a member of
itself,'' and procedures for making chains of new sets (``insert a set
into itself''). No sophisticated set theory (\eg, diagonalization)
was ever done.
After this initial period of exploration, {\AM} decided that ``equality''
was worth generalizing, and thereby discovered the relation
``same-size-as.'' ``Natural numbers'' were based on this, and soon most
simple arithmetic operations were defined.
Since addition arose as an analog to union, and multiplication as a
repeated substitution followed by a generalized kind
of \ffootnote{unioning,}{Take two bags A and B. Replace each element of A by the
bag B. Remove one level of parentheses by taking the union of all elements of the
transfigured bag A. Then that new bag will have as many elements as
the product of the lengths of the two original bags. } it came as
quite a surprise when {\AM} noticed that they were related (namely,
$n+n=2\times n$). {\AM} later rediscovered multiplication in three other ways:
as repeated addition, as the numeric analog of the Cartesian product
of sets, and by studying the cardinality of power \footnote{sets.}{The size of
the set of all subsets of S is 2$ā{\|S\|}$. Thus the power set
of $A\union B$ has
length equal to the {\sl product} of the lengths of the power sets of A
and B individually (assuming A and B are disjoint).} These
operations were defined in different ways, so it was an unexpected
(to {\AM}) discovery when they all turned out to be equivalent. These
surprises caused {\AM} to give the concept ``Times'' quite a high Worth
rating.
Exponentiation was defined as repeated multiplication. Unfortunately,
{\AM} never found any obvious properties of exponentiation, hence lost
all interest in it.
Soon after defining multiplication, {\AM} investigated the process of
multiplying a number by itself: squaring. The inverse of this turned
out to be interesting, and led to the definition of square-root. {\AM}
remained content to play around with the concept of
{\sl integer}-square-root. Although it isolated the set of numbers
which had no square root, {\AM} was never close to discovering
rationals, let alone irrationals.
Raising to fourth-powers, and fourth-rooting, were discovered at this
time. Perfect squares and perfect fourth-powers were isolated. Many
other numeric operations and kinds of numbers were isolated: Odds,
Evens, Doubling, Halving, etc. Primitive notions of numeric
inequality were defined but {\AM} never even discovered Trichotomy.
The associativity and commutativity of multiplication indicated that
it could accept a BAG of numbers as its argument. When {\AM} defined
the inverse operation corresponding to Times, this property allowed
the definition to be: ``any {\sl bag} of numbers ($>1$) whose product is
$x$.'' This was just the notion of factoring a number $x$.
Minimally-factorable numbers turned out to be what we call primes.
Maximally-factorable numbers were also thought to be interesting.
Prime pairs were discovered in a bizarre way: by restricting addition
(its arguments and its values) to \ffootnote{Primes.}{That is, consider the set
of triples $p,q,r$, all primes, for which $p+q=r$. Then one of them must
be ``2,'' and the other two must therefore form a prime pair. } {\AM}
conjectured the fundamental theorem of arithmetic (unique
factorization into primes) and Goldbach's conjecture (every even
number greater than 2 is the sum of two primes) in a surprisingly symmetric way.
The unary representation of numbers gave way to a representation as a
bag of primes (based on unique factorization), but {\AM} never thought
of exponential notation.\fnote{A tangential note:
All of the discoveries mentioned above were made by {\AM} working
by itself, with a human being observing its behavior. If the
level of sophistication of {\AM}'s concepts were higher (or the level
of sophistication of its users were lower), then it might be worthwhile
to develop a nice user--system interface. The user in that case
could---and ought to---work right along with {\AM} as a co-researcher.}
Since the key concepts of remainder,
greater-than, gcd, and exponentiation were never mastered, progress
in number theory was arrested.
When a new base of {\sl geometric} concepts was added, {\AM} began finding
some more general associations. In place of the strict definitions
for the equality of lines, angles, and triangles, came new
definitions of concepts we refer to as Parallel, Equal-measure,
Similar, Congruent, Translation, Rotation, plus many which have no
common name (\eg, the relationship of two triangles sharing a common
angle). An unexpected geometric interpretation of Goldbach's conjecture was
found.\fnote{Given all angles of a prime number of degrees,
$(0,1,2,3,5,7,11,\ldots,179\deg$), then any angle between 0 and 180$\deg$
can be approximated (to within 1$\deg$) as the sum of two of
those angles.} Lacking a geometry ``model'' (an analogic
representation like the one Gelernter employed), {\AM} was doomed to
failure with respect to proposing only plausible geometric
conjectures.
Similar restrictions due to poor ``visualization'' abilities would crop
up in topology. The concepts of continuity, infinity, and measure
would have to be fed to {\AM} before it could enter the domains of
analysis. More and more drastic changes in its initial base would be
required, as the desired domain gets further and further from simple
finite set theory and elementary number theory.
\subsectionbegin[2.4]{AM as Half a Book}\par
\epigraph{Walking home along a deserted street late at night, the reader may
imagine himself to feel in the small of his back a cold, hard object;
and to hear the words spoken behind him, ``Easy now. This is a
stick-up. Hand over your money.'' What does the reader do? He
attempts to generate the utterance. He says to himself, now if I were
standing behind someone holding a cold, hard object against his
back, what would make me say that? What would I mean by it? The
reader is advised that he can only arrive at the deep structure of
this book, and through the deep structure the semantics, if he
attempts to generate the book for himself. The author wishes him
luck.}
\author{Linderholm}
\epigraphend
\noindent\!
Each chapter is of roughly equal importance, which explains the huge variation
in length. Start looking over Chapter 2 right away: it contains
a detailed example of what {\AM} does. Since you're reading this sentence now,
we'll assume that you want a preview of what's to come in the rest of this
document.
Chapter 3 covers the top-level control structure of the system, which is
based on the notion of an ``agenda'' of tasks to perform.
In Chapter 4 the low-level control structure is revealed: {\AM} is really
guided by a mass of heuristic rules of varying generality.
Chapter 5 contains more than you want to know about the representation
of knowledge in {\AM}. It contains a
diagram showing some of {\AM}'s starting concepts, which is
worth a look.
Most of the results of the project are presented in Chapter 6. In addition to
simply ``running'' {\AM}, several experiments have been conducted with it.
It's awkward to evaluate {\AM}, and therefore Chapter 7 is quite long and detailed.
The appendices provide material which supplements the text.
Appendix 1 contains a description of a few of the initial concepts,
some examples of how they were coded into Lisp, and a partial list of the
concepts {\AM} defined and investigated along the way.
Appendix 2 exhibits all 253 heuristics that {\AM} is
explicitly provided with.
Appendix 3 contains traces of {\AM} in action.
This book---and its readers---must come to grips with a very
interdisciplinary problem. For the reader whose background is in
Artificial Intelligence, most of the system's actions---the
``mathematics'' it does---may seem inherently uninteresting. For the
mathematician, the word ``LISP'' signifies nothing beyond a speech
impediment (to Artificial Intelligence types it also connotes a
programming impediment). As a result, there is a good bit
of definition and explanation, and the reader's indulgence is entreated.
\worldend